179 research outputs found

    Parallel cellular programming skeleton

    Cellular automata provide an abstract model of parallel computation that can be effectively used for modeling and simulating complex phenomena and systems. Parallel programming languages based on skeletons simplify the design, testing, and coding of parallel algorithms. We discuss the main characteristics of cellular automata programming models and present a cellular automata skeleton that exploits the inherent parallelism of problems of this kind. As a practical example, we show how to solve the heat equation with the skeleton. Presented at the IX Workshop Procesamiento Distribuido y Paralelo (WPDP). Red de Universidades con Carreras en Informática (RedUNCI)
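
    To make the heat-equation example concrete, here is a minimal sequential sketch of a one-dimensional cellular automaton whose local rule is the explicit finite-difference update for heat diffusion. It is not the skeleton from the paper: the names `heat_step` and `simulate`, the coefficient `alpha`, and the fixed-temperature boundaries are all assumptions made for illustration.

```python
# Illustrative sketch only: a 1D cellular automaton for heat diffusion.
# Every cell applies the same local rule using its two neighbours, which is
# what makes the update embarrassingly parallel across cells.

def heat_step(cells, alpha=0.25):
    """Return the next generation; the two boundary cells keep their value."""
    nxt = cells[:]  # copy, so every cell reads the previous generation
    for i in range(1, len(cells) - 1):
        nxt[i] = cells[i] + alpha * (cells[i - 1] - 2 * cells[i] + cells[i + 1])
    return nxt


def simulate(cells, steps, alpha=0.25):
    """Apply the local rule synchronously for a number of generations."""
    for _ in range(steps):
        cells = heat_step(cells, alpha)
    return cells


# A cold rod whose two ends are held at 100 degrees (assumed test setup).
rod = [100.0] + [0.0] * 8 + [100.0]
final = simulate(rod, 500)
print([round(v, 3) for v in final])
```

    The explicit scheme in one dimension is stable only for `alpha` at most 0.5, which is why 0.25 is assumed here. A parallel version would split the lattice among processes and exchange only the border cells at each generation.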

    Performance predictability of divide and conquer skeletons

    Parallel divide and conquer computations, which encompass a wide variety of applications, can be modeled and encapsulated as a high-level primitive called a skeleton. The paper deals with a skeleton designed for parallel divide and conquer algorithms that provides hypercubic communications among processes. It also introduces an accurate timing model for predicting the performance of the proposed primitive. The timing model still characterizes communication time through architecture parameters but introduces a few novelties: it adds different kinds of components to the analytical model by associating a performance constant with each conceptual block of the skeleton. Trace files obtained by executing code built with the skeleton are fed to linear regression techniques, which yield, among other information, the parameter values for those blocks. An extended example showing the relative accuracy of the proposed approach concludes the paper. Workshop de Procesamiento Distribuido y Paralelo (WPDP). Red de Universidades con Carreras en Informática (RedUNCI)
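
    One simple reading of the regression step is sketched below: fit a model T(n) = c0 + c1 * n to measured times of a single block by ordinary least squares. This is an assumption about the general technique, not the paper's actual model, which associates constants with several blocks. The function names are hypothetical, and the "measurements" are synthetic values generated from known constants so that the fit can be checked.

```python
# Illustrative sketch only: estimating a block's performance constants
# from (problem size, time) pairs with ordinary least squares.

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data, NOT real trace files: generated from c0 = 2.0, c1 = 0.0005.
sizes = [1000, 2000, 4000, 8000]
times = [2.0 + 0.0005 * n for n in sizes]

c0, c1 = fit_line(sizes, times)


def predict(n):
    """Predicted time of the block for problem size n under the fitted model."""
    return c0 + c1 * n


print(c0, c1, predict(16000))
```

    With several conceptual blocks, the same idea extends to multiple regression, with one column of counts per block and one fitted constant per column.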

    Towards a high performance cellular automata programming skeleton

    Cellular automata provide an abstract model of parallel computation that can be effectively used for modeling and simulating complex phenomena and systems. In this paper, we start from a skeleton designed to speed up the development of D-dimensional cellular automata applications. The key to the skeleton's usefulness is an efficient implementation, irrespective of application-specific details. In the parallel implementation on a cluster, it was important to consider issues such as task and data decomposition. With multicore clusters, new problems have emerged. The increasing number of cores per node, together with caches and shared memory inside the nodes, has led to a new hierarchy of access to processors. In this paper, we describe some optimizations that restructure the prototype code and expose an abstracted view of the multicore cluster to the high-performance CA application developer. The implementation of lattice division functions establishes a partnership relation among parallel processes. We propose that this relation can be mapped efficiently onto the communication topology of the multicore cluster. We introduce a new mapping strategy that improves performance by adapting its communication pattern to the hardware affinities among processes allocated to different cores. We apply our approach to a two-dimensional application, achieving a noticeable reduction in execution time. Presented at the X Workshop Procesamiento Distribuido y Paralelo (WPDP). Red de Universidades con Carreras en Informática (RedUNCI)


    A low communication overhead parallel implementation of the back-propagation algorithm

    The back-propagation algorithm is one of the most widely used training algorithms for neural networks. Training a multilayer perceptron with this algorithm can take a very long time, which makes neural networks difficult to adopt. One way to address this problem is to parallelize the training algorithm. Many approaches exist; however, most of them are tailored to specialized hardware. The idea of using a network of workstations as a general-purpose parallel computer is widely accepted. However, communication overhead imposes restrictions on the design of parallel algorithms. In this work, we propose a parallel implementation of the back-propagation algorithm that is suitable for a network of workstations. The objective is twofold. The first goal is to improve the performance of the training phase with low communication overhead. The second goal is to provide dynamic assignment of tasks to processors in order to make the best use of the computational resources. I Workshop de Agentes y Sistemas Inteligentes (WASI). Red de Universidades con Carreras en Informática (RedUNCI)

    Using parallel pivot vs. clustering-based techniques for web engines

    Web engines are a useful tool for searching for information on the Web. However, a large part of this information is non-textual, and in that case a metric space is used. A metric space is a set together with a notion of distance (called a metric) between its elements. In this paper we present an efficient parallelization of a pivot-based method devised for this purpose, the Sparse Spatial Selection (SSS) strategy, and we compare it with a clustering-based method, a parallel implementation of the Spatial Approximation Tree (SAT). We show that SAT compares favourably against the pivot-based SSS data structure. The experimental results were obtained on a high-performance cluster using several metric spaces, and they show load-balanced parallel strategies for the SAT. The implementations are built upon the BSP parallel computing model, which shows efficient performance for this application domain and allows a precise evaluation of algorithms. VIII Workshop de Procesamiento Distribuido y Paralelo. Red de Universidades con Carreras en Informática (RedUNCI)
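
    The general idea behind pivot-based search in a metric space can be sketched as follows. Distances from every object to a few pivots are precomputed, and the triangle inequality then lets a range query discard objects without evaluating the real distance. This is a generic, sequential illustration, not the SSS pivot-selection strategy or the parallel implementation from the paper. The pivots are picked by hand, and all names are hypothetical.

```python
# Illustrative sketch only: pivot-based filtering for range queries.
# For any pivot p, |d(q, p) - d(x, p)| <= d(q, x) by the triangle inequality,
# so if that lower bound exceeds the radius r, x cannot be an answer.

def build_index(objects, pivots, dist):
    """Precompute the distance from every object to every pivot."""
    return [[dist(x, p) for p in pivots] for x in objects]


def range_query(q, r, objects, pivots, table, dist):
    """Return (objects within distance r of q, number of real distance checks)."""
    dq = [dist(q, p) for p in pivots]
    hits = []
    evaluated = 0
    for x, row in zip(objects, table):
        # Discard x if any pivot proves that d(q, x) > r.
        if any(abs(a - b) > r for a, b in zip(dq, row)):
            continue
        evaluated += 1
        if dist(q, x) <= r:
            hits.append(x)
    return hits, evaluated


def absolute_difference(a, b):
    """A simple metric on numbers, assumed here for the example."""
    return abs(a - b)


objects = [1, 5, 9, 14, 20]
pivots = [1, 20]  # hand-picked for the example, not selected by SSS
table = build_index(objects, pivots, absolute_difference)
hits, evaluated = range_query(8, 2, objects, pivots, table, absolute_difference)
print(hits, evaluated)
```

    In this toy run only one of the five objects needs a real distance evaluation. The saving matters when the metric is expensive, as it typically is for non-textual data.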


    Simulating Behaviours to Face up an Emergency Evacuation

    Computer-based models describing pedestrian behavior in an emergency evacuation play a vital role in developing active strategies that minimize evacuation time when a closed area must be evacuated. The reference model has a hybrid structure in which the dynamics of fire and smoke propagation are modeled by means of Cellular Automata and people's behavior is simulated with Intelligent Agents. The model consists of two sub-models, the environmental one and the pedestrian one. As part of the pedestrian model, this paper concentrates on a methodology able to model some of the human behaviors frequently observed in evacuation exercises. Each agent perceives what is happening around it, selects among the options that exist in that context, and then makes a decision that reflects its ability to cope with an emergency evacuation, which in this work is called its behavior. We also developed simple exercises in which the model is applied to the simulation of an evacuation due to a potential hazard, such as fire, smoke, or some kind of collapse. Affiliation: Tissera, Pablo Cristian. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo En Inteligencia Computacional; Argentina. Affiliation: Castro, Alicia. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo En Inteligencia Computacional; Argentina. Affiliation: Printista, Alicia Marcela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico San Luis; Argentina. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina. Affiliation: Luque, Emilio. Universitat Autonoma de Barcelona; España

    Effective Use of Multicore Clusters in Parallel Cellular Automata

    Cellular automata provide an abstract model of parallel computation that can be effectively used for modeling and simulating complex phenomena and systems. We start from a template designed to speed up the development of D-dimensional cellular automata applications. The key to the template's usefulness is an efficient implementation, irrespective of application-specific details. In the parallel implementation on a cluster, it was important to consider issues such as task and data decomposition. With multicore clusters, new problems have emerged. The increasing number of cores per node, together with caches and shared memory inside the nodes, has led to a new hierarchy of access to processors. In this work we discuss and evaluate strategies that will be important in optimizing the prototype to run on a multicore cluster. The underlying idea of our proposal is to establish a relation among parallel processes based on the communication topology that arises in the implementation of task division functions. We propose that this relation can be mapped efficiently onto the multicore cluster topology. We introduce a new mapping strategy that improves performance by adapting its communication pattern to the hardware affinities among processes allocated to different cores. We apply our approach to a two-dimensional application, achieving a noticeable reduction in execution time. Sociedad Argentina de Informática e Investigación Operativa
